xen.git
15 years agocredit2: Putting a vcpu to sleep also removes the delayed_runq_add flag
Keir Fraser [Fri, 10 Dec 2010 10:49:20 +0000 (10:49 +0000)]
credit2: Putting a vcpu to sleep also removes the delayed_runq_add flag

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
15 years agox86: x2apic: Large cleanup
Keir Fraser [Thu, 9 Dec 2010 19:19:34 +0000 (19:19 +0000)]
x86: x2apic: Large cleanup

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoAdd CPU_STARTING notifier during CPU bringup.
Keir Fraser [Thu, 9 Dec 2010 16:17:33 +0000 (16:17 +0000)]
Add CPU_STARTING notifier during CPU bringup.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86: time: tsc_set_info() must skip the idle domain.
Keir Fraser [Thu, 9 Dec 2010 16:15:10 +0000 (16:15 +0000)]
x86: time: tsc_set_info() must skip the idle domain.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoMove IDLE_DOMAIN_ID defn to public header, and change DOMID_INVALID to fix clash.
Keir Fraser [Thu, 9 Dec 2010 10:09:59 +0000 (10:09 +0000)]
Move IDLE_DOMAIN_ID defn to public header, and change DOMID_INVALID to fix clash.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86: Simplify tsc_set_info() slightly -- no domain has id DOMID_INVALID.
Keir Fraser [Thu, 9 Dec 2010 09:57:08 +0000 (09:57 +0000)]
x86: Simplify tsc_set_info() slightly -- no domain has id DOMID_INVALID.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86:vlapic: Fix possible guest tick losing after save/restore
Keir Fraser [Thu, 9 Dec 2010 08:34:59 +0000 (08:34 +0000)]
x86:vlapic: Fix possible guest tick losing after save/restore

A guest vcpu may lose all ticks if vlapic->pt.irq is not restored
during the save/restore process. Fix it.

Signed-off-by: Wei Gang <gang.wei@intel.com>
15 years agoFix xc_cpuid_hvm_policy to avoid guest CPUID feature missing.
Keir Fraser [Thu, 9 Dec 2010 08:34:04 +0000 (08:34 +0000)]
Fix xc_cpuid_hvm_policy to avoid guest CPUID feature missing.

Signed-off-by: Wei Gang <gang.wei@intel.com>
15 years agosched/arinc653: fix another unsigned < 0 comparison
Keir Fraser [Thu, 9 Dec 2010 08:30:30 +0000 (08:30 +0000)]
sched/arinc653: fix another unsigned < 0 comparison
replacing it with a test of the appropriate unsigned max.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
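
The class of bug fixed here can be sketched as follows. The helper names are hypothetical and not taken from the scheduler source; the point is only that an unsigned value is never less than zero, so a failed lookup has to be signalled by, and tested against, the maximum of the unsigned type.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical lookup: returns the index of `key`, or UINT_MAX if absent. */
static unsigned int find_index(const int *table, size_t len, int key)
{
    for (size_t i = 0; i < len; i++)
        if (table[i] == key)
            return (unsigned int)i;
    return UINT_MAX;   /* "not found" sentinel for an unsigned result */
}

/*
 * A check written as `idx < 0` can never be true for an unsigned idx,
 * so the failure path would be dead code. Compare against the unsigned
 * maximum instead.
 */
static int lookup_failed(unsigned int idx)
{
    return idx == UINT_MAX;
}
```

A caller would test `lookup_failed(find_index(...))` before using the index.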
15 years agox86/mm: change ASSERTs to BUG_ONs in mem_sharing.c
Tim Deegan [Wed, 8 Dec 2010 10:46:31 +0000 (10:46 +0000)]
x86/mm: change ASSERTs to BUG_ONs in mem_sharing.c

These two ASSERTs have important side-effects so make them into BUG_ONs
consistent with the rest of the file.
Bug found by Jui-Hao Chiang <juihaochiang@gmail.com>.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
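
Why an ASSERT with a side effect is dangerous can be shown with a minimal sketch. The macros below are illustrative stand-ins, not Xen's definitions, and take_ref() is a hypothetical helper: an assertion macro that is compiled out in non-debug builds discards its argument, while a BUG_ON-style macro always evaluates it.

```c
#include <stdlib.h>

/* Stand-in for an ASSERT in a non-debug build: the argument is dropped. */
#define SKETCH_ASSERT(cond)  ((void)0)

/* Stand-in for BUG_ON: the argument is always evaluated. */
#define SKETCH_BUG_ON(cond)  do { if (cond) abort(); } while (0)

static int refs;

/* Hypothetical helper with a side effect: takes a reference, returns 1. */
static int take_ref(void)
{
    refs++;
    return 1;
}

static int refs_after_assert(void)
{
    refs = 0;
    SKETCH_ASSERT(take_ref());     /* side effect silently lost */
    return refs;
}

static int refs_after_bug_on(void)
{
    refs = 0;
    SKETCH_BUG_ON(!take_ref());    /* side effect always happens */
    return refs;
}
```

With these stand-ins the first function leaves the counter at 0 and the second leaves it at 1.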
15 years agox86: remove BUG_ON() from QUIRK_IOAPIC_*_REGSEL handler
Keir Fraser [Tue, 7 Dec 2010 18:32:04 +0000 (18:32 +0000)]
x86: remove BUG_ON() from QUIRK_IOAPIC_*_REGSEL handler

Since (non-pvops, 32-bit only up to 2.6.27) Linux would report "BAD"
unconditionally on all SiS chipset versions (it only looks for a PCI
device at 0000:00:00.0 with SiS as the vendor), we must not crash if
the report on a 64-bit hypervisor doesn't match the #define (which is
zero).

While we could honor the quirk indication even on 64-bit, it doesn't
seem worthwhile, as there's no evidence that newer SiS chipsets
(supporting 64-bit CPUs) are actually affected.

This should also address bug 1687 (mis-reported, however, afaict).

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agosvm: dump VMCB physical address
Keir Fraser [Tue, 7 Dec 2010 18:30:19 +0000 (18:30 +0000)]
svm: dump VMCB physical address

VMCB physical address is useful for hardware debug. This small patch
dumps VMCB physical address.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
15 years agoamd xsave: Enable SVM intercept for xsetbv instruction.
Keir Fraser [Tue, 7 Dec 2010 18:28:19 +0000 (18:28 +0000)]
amd xsave: Enable SVM intercept for xsetbv instruction.

SVM introduces an intercept control bit for the xsetbv instruction.
This patch enables the xsetbv intercept for SVM.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
15 years agoamd xsave: Enable XSAVE/XRSTOR for SVM guest
Keir Fraser [Tue, 7 Dec 2010 18:27:40 +0000 (18:27 +0000)]
amd xsave: Enable XSAVE/XRSTOR for SVM guest

This patch creates a common interface for handling xsetbv.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
15 years agoamd xsave: Move xsave initialization code to a common place
Keir Fraser [Tue, 7 Dec 2010 18:26:38 +0000 (18:26 +0000)]
amd xsave: Move xsave initialization code to a common place

This patch moves the xsave/xrstor code to the common CPU file. First,
it prepares xsave/xrstor support for AMD CPUs. Second, without this
patch Xen would crash in __context_switch() on xsave-capable AMD
CPUs. The crash was due to cpu_has_xsave reporting true in domain.c
while the xsave space wasn't initialized.

Signed-off-by: Wei Huang <wei.huang2@amd.com>
15 years agox86 hvm: x2APIC emulation
Keir Fraser [Tue, 7 Dec 2010 18:24:12 +0000 (18:24 +0000)]
x86 hvm: x2APIC emulation

This patch enables Xen to handle x2APIC MSR accesses from HVM guests,
which is faster (it avoids decoding MMIO accesses). Credit goes to
Gleb Natapov, who completed the work for KVM.

Tested with a 4-vcpu guest, with and without x2apic support.

From: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86 hvm: Fix VLAPIC TMCCT register when timer is one-shot
Keir Fraser [Tue, 7 Dec 2010 18:10:46 +0000 (18:10 +0000)]
x86 hvm: Fix VLAPIC TMCCT register when timer is one-shot

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agominios: reverse layering of xc vs minios fd close
Keir Fraser [Fri, 3 Dec 2010 06:37:48 +0000 (06:37 +0000)]
minios: reverse layering of xc vs minios fd close

Having minios close() call back into the libxc core close routines is
backwards and unexpected. On every other OS the libxc core close
routine calls close().

Export minios specific functions from the minios libxc code to
implement fd closing for each type of xc file handle and simply call
close() in the core close routine.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
15 years agoMAINTAINERS: Correct Winston Wang's email address.
Keir Fraser [Fri, 3 Dec 2010 06:36:41 +0000 (06:36 +0000)]
MAINTAINERS: Correct Winston Wang's email address.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
15 years agoxen: x86_32p: fix build breakage from 22456:1b6cc8c6d1c7
Keir Fraser [Fri, 3 Dec 2010 06:35:40 +0000 (06:35 +0000)]
xen: x86_32p: fix build breakage from 22456:1b6cc8c6d1c7

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
15 years agox86: Add -Wredundant-decls to Xen build flags.
Keir Fraser [Thu, 2 Dec 2010 09:37:04 +0000 (09:37 +0000)]
x86: Add -Wredundant-decls to Xen build flags.

Fix up the fallout.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoARINC 653 scheduler
Keir Fraser [Wed, 1 Dec 2010 21:20:14 +0000 (21:20 +0000)]
ARINC 653 scheduler
From: Josh Holtrop <Josh.Holtrop@dornerworks.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
15 years agox86/IRQ: pass CPU masks by reference rather than by value in more places
Keir Fraser [Wed, 1 Dec 2010 20:12:12 +0000 (20:12 +0000)]
x86/IRQ: pass CPU masks by reference rather than by value in more places

Additionally simplify operations on them in a few cases.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agox86/IRQ: consolidate declarations
Keir Fraser [Wed, 1 Dec 2010 20:11:30 +0000 (20:11 +0000)]
x86/IRQ: consolidate declarations

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agox86: fix IRQ migration when using directed EOI (broken with c/s 20465)
Keir Fraser [Wed, 1 Dec 2010 20:10:27 +0000 (20:10 +0000)]
x86: fix IRQ migration when using directed EOI (broken with c/s 20465)

In directed-EOI mode, there is no chance to do the migration in
mask_and_ack_level_ioapic_irq(), as the remote IRR bit can't possibly
be clear after issuing the EOI to the LAPIC. Consequently, there's no
point to even try. Instead, migration must be done in
end_level_ioapic_irq(), and it requires masking the interrupt source
prior to issuing the EOI to the IO-APIC.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agox86 hvm: Do not overwrite boot-cpu capability data on VMX/SVM startup.
Keir Fraser [Tue, 30 Nov 2010 11:34:08 +0000 (11:34 +0000)]
x86 hvm: Do not overwrite boot-cpu capability data on VMX/SVM startup.

This was apparently required back in the earliest days of Xen, but we
now properly initialise CPU capabilities early during bootstrap.
Re-writing capability data later now causes problems if specific
features have been deliberately masked out.

Thanks to Weidong Han at Intel for finding such a bug, where the XSAVE
feature is masked out by default but then erroneously written back
during VMX initialisation. This would cause memory corruption problems
during boot on XSAVE-capable systems.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86: fixes after emuirq changes
Keir Fraser [Mon, 29 Nov 2010 17:44:32 +0000 (17:44 +0000)]
x86: fixes after emuirq changes

Signed-off-by: Wei Wang <wei.wang2@amd.com>
15 years agox86: tighten filter on ptwr_do_page_fault()
Keir Fraser [Mon, 29 Nov 2010 14:40:55 +0000 (14:40 +0000)]
x86: tighten filter on ptwr_do_page_fault()

Even not-so-recent Linux may, due to post-2.6.18 changes to the
process creation code, cause quite a number (depending on environment
and argument size) of faulting accesses to user space originating from
kernel mode. Generally those happen for non-present pages and would
lead to a nested page fault from guest_get_eff_l1e(). They can be
avoided by checking for PFEC_page_present as long as the guest isn't
running on shadow page tables.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86-64: don't crash Xen upon direct pv guest access to GDT/LDT mapping area
Keir Fraser [Mon, 29 Nov 2010 14:34:32 +0000 (14:34 +0000)]
x86-64: don't crash Xen upon direct pv guest access to GDT/LDT mapping area

handle_gdt_ldt_mapping_fault() is intended to deal with indirect
accesses (i.e. those caused by descriptor loads) to the GDT/LDT
mapping area only. While for 32-bit segment limits indeed prevent the
function being entered for direct accesses (i.e. a #GP fault will be
raised even before the address translation gets done), on 64-bit even
user mode accesses would lead to control reaching the BUG_ON() at the
beginning of that function.

Fortunately the fix is simple: Since the guest kernel runs in ring 3,
any guest direct access will have the "user mode" bit set, whereas
descriptor loads always do the translations to access the actual
descriptors as kernel mode ones.

Signed-off-by: Jan Beulich <jbeulich@novell.com>

Further, relax the BUG_ON() in handle_gdt_ldt_mapping_fault() to a
check-and-bail. This avoids any problems in future, if we don't
execute x86_64 guest kernels in ring 3 (e.g., because we use a
lightweight HVM container).

Signed-off-by: Keir Fraser <keir@xen.org>
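
The distinction drawn above can be sketched as a predicate on the page-fault error code. The bit positions are the architectural x86 ones; the macro spellings and the function name are hypothetical and not taken from the patch. The sketch assumes a 64-bit PV guest whose kernel runs in ring 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural positions). */
#define SKETCH_PFEC_PAGE_PRESENT  (1u << 0)
#define SKETCH_PFEC_WRITE_ACCESS  (1u << 1)
#define SKETCH_PFEC_USER_MODE     (1u << 2)

/*
 * A direct guest access carries the user-mode bit because the guest
 * kernel runs in ring 3. Descriptor loads are performed as kernel-mode
 * accesses, so the bit is clear for them.
 */
static bool may_be_descriptor_load_fault(uint32_t error_code)
{
    return !(error_code & SKETCH_PFEC_USER_MODE);
}
```

A handler following the check-and-bail approach would return "not handled" when this predicate is false, rather than hitting a BUG_ON().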
15 years agoxenpaging: increase recently used pages from 4MB to 64MB
Keir Fraser [Fri, 26 Nov 2010 14:25:30 +0000 (14:25 +0000)]
xenpaging: increase recently used pages from 4MB to 64MB

Increase recently used pages from 4MB to 64MB.  Keeping more pages in
memory allows the guest to make more progress if the paging file spans
the entire guest memory.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: handle temporary out-of-memory conditions during page-in
Keir Fraser [Fri, 26 Nov 2010 14:23:31 +0000 (14:23 +0000)]
xenpaging: handle temporary out-of-memory conditions during page-in

p2m_mem_paging_prep() should return -ENOMEM if a new page could not be
allocated. This can be handled in xenpaging by retrying the page-in.
Right now such a condition stalls the guest: xenpaging simply exits,
so the requested page never comes back. xenpaging could instead retry
the allocation indefinitely to rescue the guest.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
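
A retry loop of the kind described can be sketched as below. Everything here is an assumption made for illustration: the callback type stands in for the real page-in preparation call, fake_prep() simulates a temporary shortage, and the max_tries bound exists only so the sketch can terminate (0 means retry indefinitely).

```c
#include <errno.h>

/* Hypothetical callback standing in for the page-in preparation call. */
typedef int (*prep_fn)(unsigned long gfn, void *ctx);

/* Retry while the result is -ENOMEM; return any other result as-is. */
static int page_in_with_retry(prep_fn prep, unsigned long gfn, void *ctx,
                              unsigned int max_tries)
{
    unsigned int tries = 0;
    int rc;

    do {
        rc = prep(gfn, ctx);
        tries++;
    } while (rc == -ENOMEM && (max_tries == 0 || tries < max_tries));

    return rc;
}

/* Example callback: fails with -ENOMEM until the counter reaches 0. */
static int fake_prep(unsigned long gfn, void *ctx)
{
    unsigned int *remaining = ctx;

    (void)gfn;
    if (*remaining > 0) {
        (*remaining)--;
        return -ENOMEM;
    }
    return 0;
}
```

A real caller would probably sleep briefly between attempts so that memory has a chance to be freed.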
15 years agoxenpaging: print xenpaging cmdline options
Keir Fraser [Fri, 26 Nov 2010 14:22:38 +0000 (14:22 +0000)]
xenpaging: print xenpaging cmdline options

Print xenpaging arguments to simplify domain_id mapping from xenpaging
logfile to other logfiles and Xen console output.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: optimize p2m_mem_paging_populate
Keir Fraser [Fri, 26 Nov 2010 14:22:11 +0000 (14:22 +0000)]
xenpaging: optimize p2m_mem_paging_populate

p2m_mem_paging_populate will always put another request in the ring.
To reduce pressure on the ring, place only required requests in the
ring.  If the gfn was already processed by another thread, and the
current vcpu does not need to be paused, p2m_mem_paging_resume will do
nothing with the request.  And also xenpaging will drop the request if
the vcpu does not need a wakeup.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: when populating a page, check if populating is already in progress
Keir Fraser [Fri, 26 Nov 2010 14:21:33 +0000 (14:21 +0000)]
xenpaging: when populating a page, check if populating is already in progress

p2m_mem_paging_populate can be called several times from different
vcpus. If the page is already in state p2m_ram_paging_in and has a new
valid mfn, invalidating this new mfn will cause trouble later when
p2m_mem_paging_resume sets the new gfn/mfn pair back to state
p2m_ram_rw.  Detect this situation and keep the p2m state if the page
is still in the process of being paged-out or is already paged-in.  In
fact, p2m state p2m_ram_paged is the only state where the mfn type can
be invalidated.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: allow negative num_pages and limit num_pages
Keir Fraser [Fri, 26 Nov 2010 14:20:39 +0000 (14:20 +0000)]
xenpaging: allow negative num_pages and limit num_pages

Simplify paging size argument. If a negative number is specified, it
means the entire guest memory should be paged out. This is useful for
debugging. Also limit num_pages to the guests max_pages.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
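
The argument handling described above amounts to a clamp. A minimal sketch, assuming a hypothetical helper name and plain long arguments rather than whatever types xenpaging actually uses:

```c
/*
 * Hypothetical helper: a negative num_pages means "page out the entire
 * guest", and any request is capped at the guest's max_pages.
 */
static long effective_num_pages(long num_pages, long max_pages)
{
    if (num_pages < 0 || num_pages > max_pages)
        return max_pages;
    return num_pages;
}
```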
15 years agoxenpaging: notify policy only on resume
Keir Fraser [Fri, 26 Nov 2010 14:20:10 +0000 (14:20 +0000)]
xenpaging: notify policy only on resume

If a page is requested more than once, the policy is also notified
more than once about the page-in. However, a page-in happens only
once. Any further resume will only unpause the other vcpu. The
repeated notifications put the page into the mru list multiple times
and unlock other already resumed pages too early. In the worst
case, a page that was just resumed can be evicted right away, causing
a deadlock in the guest.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: print p2mt for already paged-in pages
Keir Fraser [Fri, 26 Nov 2010 14:19:38 +0000 (14:19 +0000)]
xenpaging: print p2mt for already paged-in pages

Add more debug output, print p2mt for pages which were requested more
than once.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: print info when free request slots drop below 2
Keir Fraser [Fri, 26 Nov 2010 14:19:09 +0000 (14:19 +0000)]
xenpaging: print info when free request slots drop below 2

Add a debugging aid for free request slots in the ring buffer.
The ring should never become full, but print info if it does.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: add signal handling
Keir Fraser [Fri, 26 Nov 2010 14:18:32 +0000 (14:18 +0000)]
xenpaging: add signal handling

Leave paging loop if xenpaging gets a signal.
Remove paging file on exit.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: populate paged-out pages unconditionally in grant code
Keir Fraser [Fri, 26 Nov 2010 14:17:56 +0000 (14:17 +0000)]
xenpaging: populate paged-out pages unconditionally in grant code

Populate a page unconditionally to avoid missing a page-in request.
If the page is already in the process of being paged-in, this vcpu
will be stopped and later resumed once the page content is usable
again.

This matches other p2m_mem_paging_populate usage in the source tree.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
15 years agoxenpaging: allow only one xenpaging binary per guest
Keir Fraser [Fri, 26 Nov 2010 14:17:01 +0000 (14:17 +0000)]
xenpaging: allow only one xenpaging binary per guest

Make sure only one xenpaging binary is active per domain.
Print info when the host lacks the required features for xenpaging.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoxenpaging: Open paging file only if xenpaging_init() succeeds
Keir Fraser [Fri, 26 Nov 2010 14:15:59 +0000 (14:15 +0000)]
xenpaging: Open paging file only if xenpaging_init() succeeds

Open paging file only if xenpaging_init() succeeds. It can fail if the
host does not support the required virtualization features such as EPT
or if xenpaging was already started for this domain_id.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoxenpaging: break endless loop during initial page-out with large pagefiles
Keir Fraser [Fri, 26 Nov 2010 14:15:21 +0000 (14:15 +0000)]
xenpaging: break endless loop during initial page-out with large pagefiles

To allow starting xenpaging right after 'xm start XYZ', I specified a
pagefile size equal to the guest memory size, in the hope of catching
more errors where the paged-out state of a p2mt is not checked.

While doing that, xenpaging got into an endless loop because some
pages can't be paged out right away. Now the policy reports an error if
the gfn number wraps.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Already-Acked-by: Patrick Colp <pjcolp@cs.ubc.ca>
Already-Acked-by: Keir Fraser <keir.fraser@citrix.com>
15 years agoAllow assign_irq_vector to return an old vector while moving an irq
Keir Fraser [Fri, 26 Nov 2010 10:10:40 +0000 (10:10 +0000)]
Allow assign_irq_vector to return an old vector while moving an irq

The guest calls assign_irq_vector() to assign one if it doesn't have
one, and to find out the vector if it does have one.

If the cpu mask passed intersects with the existing mask, the old
vector is simply returned.

However, if the irq happens to be in transit at the time, this returns
EBUSY.  This is unnecessary if, as soon as the irq migration succeeds,
the logic would just return the old vector anyway.

This patch makes two changes:
* Switch the checks, so if the mask overlaps it always returns
* Return -EAGAIN instead of -EBUSY for moving irqs, to let the caller
know that the failure is temporary and may work if repeated.

This fixes a bug where on certain hardware, using the credit2
scheduler, a pvops kernel with multiple vcpus doesn't boot.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
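
The reordered checks can be sketched as below. This is not the Xen function: the struct, the use of a 64-bit word as a CPU mask, and the caller-supplied new_vector (standing in for the elided vector allocation) are all simplifications assumed for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-IRQ state. */
struct irq_sketch {
    int      vector;            /* current vector, or 0 if none assigned */
    uint64_t cpu_mask;          /* CPUs the current vector is valid on */
    bool     move_in_progress;  /* irq migration currently in transit */
};

/* Returns the vector (> 0) on success or a negative errno value. */
static int assign_vector_sketch(struct irq_sketch *irq, uint64_t mask,
                                int new_vector)
{
    /* 1. Overlap check first: the old vector is returned even mid-move. */
    if (irq->vector > 0 && (irq->cpu_mask & mask))
        return irq->vector;

    /* 2. Only then consider the in-transit case; the failure is temporary. */
    if (irq->move_in_progress)
        return -EAGAIN;

    /* 3. Otherwise take the new vector (real allocation elided). */
    irq->vector = new_vector;
    irq->cpu_mask = mask;
    return irq->vector;
}
```

A caller seeing -EAGAIN can repeat the request later, whereas -EBUSY gave no hint that the failure was transient.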
15 years agocpupool: Simplify locking, use refcounts for cpupool liveness.
Keir Fraser [Fri, 26 Nov 2010 10:07:57 +0000 (10:07 +0000)]
cpupool: Simplify locking, use refcounts for cpupool liveness.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86/mm: remove incorrect BUG_ON.
Tim Deegan [Wed, 24 Nov 2010 10:20:03 +0000 (10:20 +0000)]
x86/mm: remove incorrect BUG_ON.
This BUG_ON tests a property of an effectively random PFN in the guest,
and is explicitly _not_ seeing the MFN that's known to be owned.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
15 years agotools/xl: show shutdown reason code, improve xl list heading
Ian Jackson [Tue, 23 Nov 2010 19:36:14 +0000 (19:36 +0000)]
tools/xl: show shutdown reason code, improve xl list heading

Previously, xl list would not reveal the shutdown reason code unless
it was SHUTDOWN_crashed.  This is unfortunate; it makes it hard for
scripts which use xl to tell what's going on.

In this patch:

 * xl list shows the reason code as a single letter if it is
   any of the defined values from sched.h:
       -   poweroff or domain not shut down
       r   reboot
       s   suspend
       c   crashed
       w   watchdog
   This is not 100% backward-compatible with xm but I think it's a
   justifiable improvement.  It would be nice to make the same change
   to xm.

 * xl list -v shows the full numeric reason code in hex, or "-" if the
   domain is not shut down.

 * xl list -v has column headings for the UUID and numeric reason
   code.  The heading for the reason code overlaps with the UUID a bit.
   These headings are intended for human readers; scripts can parse
   the output by breaking on whitespace.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
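
The single-letter mapping listed above can be sketched as a small function. The numeric values are the SHUTDOWN_* reason codes from Xen's public sched.h; the function name, the SKETCH_ prefix, and the '?' fallback for undefined codes are assumptions and not taken from the xl source.

```c
/* Shutdown reason codes, mirroring Xen's public sched.h values. */
#define SKETCH_SHUTDOWN_poweroff 0
#define SKETCH_SHUTDOWN_reboot   1
#define SKETCH_SHUTDOWN_suspend  2
#define SKETCH_SHUTDOWN_crash    3
#define SKETCH_SHUTDOWN_watchdog 4

/* `is_shutdown` is nonzero if the domain has shut down. */
static char shutdown_reason_letter(int is_shutdown, unsigned int reason)
{
    if (!is_shutdown)
        return '-';

    switch (reason) {
    case SKETCH_SHUTDOWN_poweroff: return '-';
    case SKETCH_SHUTDOWN_reboot:   return 'r';
    case SKETCH_SHUTDOWN_suspend:  return 's';
    case SKETCH_SHUTDOWN_crash:    return 'c';
    case SKETCH_SHUTDOWN_watchdog: return 'w';
    default:                       return '?';  /* assumption: unknown code */
    }
}
```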
15 years agotools: Add xl python bindings for cpumap
Juergen Gross [Tue, 23 Nov 2010 19:35:09 +0000 (19:35 +0000)]
tools: Add xl python bindings for cpumap

Signed-off-by: juergen.gross@ts.fujitsu.com
15 years agotools/xl: only use the special "--incoming" name on actual migration
Tim Deegan [Tue, 23 Nov 2010 19:33:30 +0000 (19:33 +0000)]
tools/xl: only use the special "--incoming" name on actual migration

Only use the special "--incoming" name on actual migration
and not on every subsequent reboot.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools/xl: VMs that are started paused shouldn't reboot paused too.
Tim Deegan [Tue, 23 Nov 2010 19:31:29 +0000 (19:31 +0000)]
tools/xl: VMs that are started paused shouldn't reboot paused too.

Otherwise a VM that's been migrated won't ever reboot cleanly again.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools/xl: make it explicit that we migrate from stdin.
Tim Deegan [Tue, 23 Nov 2010 19:30:27 +0000 (19:30 +0000)]
tools/xl: make it explicit that we migrate from stdin.

No semantic changes, just makes things a bit less confusing.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools: fetch remote changesets when force refetching/resetting qemu
Ian Campbell [Tue, 23 Nov 2010 19:29:13 +0000 (19:29 +0000)]
tools: fetch remote changesets when force refetching/resetting qemu

This makes "make tools/ioemu-dir-force-update" usable for picking up
an entirely new QEMU_TAG.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agoocaml: install built modules
Ian Campbell [Tue, 23 Nov 2010 19:28:03 +0000 (19:28 +0000)]
ocaml: install built modules

Previously the install target was having no effect because it ended up
calling the default target in the subdir Makefile instead of the
install target.

Resolve this by tying the tools/ocaml Makefiles into the generic
handling done by tools/Rules.mk.

Other changes arising in one way or another from this:
- Add libs/xl/META.in
- Update .hgignore for META files
- Create leading directories
- Remove existing module before installation in install target
  (works around what appears to be a quirk of "ocamlfind install")
- Use the globally defined $(DESTDIR)
- Move "ocamlfind printconf destdir" to a common variable,
  repurposing existing unused OCAMLDESTDIR, incorporating $(DESTDIR) at
  the same time.
- Drop a few unused variable definitions (mainly to avoid deciding if
  $(DESTDIR) made sense for them or not).
- Pass -destdir to ocamlfind in uninstall target for symmetry with
  install target.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agolibxl, xl: Account for shadow memory for PV guests too
Stefano Stabellini [Tue, 23 Nov 2010 19:25:00 +0000 (19:25 +0000)]
libxl, xl: Account for shadow memory for PV guests too

We need to account for the memory needed by shadow pagetables even for PV
guests, because in that case shadow pagetables are used during live
migration.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools/libxl: use qdisk if blktap2 is not available
Stefano Stabellini [Tue, 23 Nov 2010 19:23:22 +0000 (19:23 +0000)]
tools/libxl: use qdisk if blktap2 is not available

Whenever blktap2 is not available use qdisk as block backend instead.

[ This feature will only work with the relevant changesets from
  qemu-xen-unstable, recently applied.  -iwj ]

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoQEMU_TAG update
Ian Jackson [Tue, 23 Nov 2010 19:21:22 +0000 (19:21 +0000)]
QEMU_TAG update

15 years agoReapply 61c0c52a8c6c "qemu-xen: build adjustments"
Ian Jackson [Tue, 23 Nov 2010 19:12:55 +0000 (19:12 +0000)]
Reapply 61c0c52a8c6c "qemu-xen: build adjustments"

The changeset
  qemu-xen: build adjustments to support out-of-tree builds
works after all.  Sorry for the noise.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agoRevert 61c0c52a8c6c "qemu-xen: build adjustments"
Ian Jackson [Tue, 23 Nov 2010 18:38:16 +0000 (18:38 +0000)]
Revert 61c0c52a8c6c "qemu-xen: build adjustments"

It appears that the changeset
  qemu-xen: build adjustments to support out-of-tree builds
broke the build.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agoqemu-xen: build adjustments to support out-of-tree builds
Jan Beulich [Tue, 23 Nov 2010 16:43:38 +0000 (16:43 +0000)]
qemu-xen: build adjustments to support out-of-tree builds

QEMU by itself can be built outside of its source directory. With the
qemu repository being separate from the hypervisor/tools one it seems
to make sense to make use of this feature, but doing so requires a
couple of adjustments to the Xen changes to it. Basically, if
CONFIG_QEMU is found to indicate an existing directory, this directory
will be used rather than cloning the git repo into the build tree.

[ This changeset is the xen-unstable part of the patch but also
  includes the QEMU_TAG update to pull in the qemu part. -iwj ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agox86 hvm: Fix VPMU issue on Nehalem cpus
Keir Fraser [Mon, 22 Nov 2010 19:16:34 +0000 (19:16 +0000)]
x86 hvm: Fix VPMU issue on Nehalem cpus

Fix an issue on Nehalem cpus where performance counter overflows may
lead to endless interrupt loops on this cpu.

Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
15 years agox86: Check for MWAIT in CPUID before using it in ACPI idle code.
Keir Fraser [Mon, 22 Nov 2010 19:13:00 +0000 (19:13 +0000)]
x86: Check for MWAIT in CPUID before using it in ACPI idle code.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoevent_channel: Fix uninitialised variable build error.
Keir Fraser [Mon, 22 Nov 2010 08:29:03 +0000 (08:29 +0000)]
event_channel: Fix uninitialised variable build error.

Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoQEMU_TAG update
Ian Jackson [Fri, 19 Nov 2010 18:55:18 +0000 (18:55 +0000)]
QEMU_TAG update

15 years agostubdom: allow building with read-only sources
Jan Beulich [Fri, 19 Nov 2010 18:39:33 +0000 (18:39 +0000)]
stubdom: allow building with read-only sources

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agolibxc: fix tracing (broken with hypercall buffers)
Andre Przywara [Fri, 19 Nov 2010 18:15:14 +0000 (18:15 +0000)]
libxc: fix tracing (broken with hypercall buffers)

The attached patch makes Xen tracing work again, after the introduction
of the hypercall buffers broke it. Just a missing line.

Thanks to Uwe Dannowski for reporting this.

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agox86-64: Re-map all physdev_op struct names in compat shim.
Keir Fraser [Fri, 19 Nov 2010 13:49:54 +0000 (13:49 +0000)]
x86-64: Re-map all physdev_op struct names in compat shim.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoIntroduce PHYSDEVOP_get_free_pirq
Keir Fraser [Fri, 19 Nov 2010 13:45:08 +0000 (13:45 +0000)]
Introduce PHYSDEVOP_get_free_pirq

Introduce a new physdev_op called PHYSDEVOP_get_free_pirq to allow a
guest to get a free pirq number from Xen; the hypervisor would keep
that pirq free for the guest to use in a mapping.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoInterrupt remapping to PIRQs in HVM guests
Keir Fraser [Fri, 19 Nov 2010 13:43:24 +0000 (13:43 +0000)]
Interrupt remapping to PIRQs in HVM guests

This patch allows HVM guests to remap interrupts and MSIs into pirqs;
once the mapping is in place the guest will receive the interrupt (or
the MSI) as an event.  The interrupt to be remapped can either be an
interrupt of an emulated device or an interrupt of a passthrough
device and we keep track of that.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
15 years agoVPMU: Add the Intel CPU X7542 to the list of supported processors
Keir Fraser [Fri, 19 Nov 2010 13:26:42 +0000 (13:26 +0000)]
VPMU: Add the Intel CPU X7542 to the list of supported processors

Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
15 years agoconsolidate custom parameter parsing routines looking for boolean values
Keir Fraser [Fri, 19 Nov 2010 13:25:54 +0000 (13:25 +0000)]
consolidate custom parameter parsing routines looking for boolean values

Have a single function for this, rather than doing the same in half a
dozen places.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
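
A single shared helper of the kind described might look like the sketch below. The function name, the return convention, and the accepted spellings are assumptions made for illustration and are not taken from the patch.

```c
#include <string.h>

/*
 * Hypothetical consolidated parser: returns 1 for true, 0 for false,
 * and -1 if the string is not a recognised boolean.
 */
static int parse_bool_sketch(const char *s)
{
    static const char *const yes[] = { "yes", "on", "true", "enable", "1" };
    static const char *const no[]  = { "no", "off", "false", "disable", "0" };

    for (size_t i = 0; i < sizeof(yes) / sizeof(yes[0]); i++)
        if (strcmp(s, yes[i]) == 0)
            return 1;

    for (size_t i = 0; i < sizeof(no) / sizeof(no[0]); i++)
        if (strcmp(s, no[i]) == 0)
            return 0;

    return -1;
}
```

Each custom parameter handler can then call the one helper and deal only with the -1 case itself.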
15 years agox86-64: one more adjustment for Fam10 MMCONF enabling
Keir Fraser [Fri, 19 Nov 2010 13:24:00 +0000 (13:24 +0000)]
x86-64: one more adjustment for Fam10 MMCONF enabling

The BASE_VALID() macro needs adjustment to match the other changes
done to the original Linux code (which are all queued to be merged
into Linus' tree), plus per Hypertransport specification the range
0xff00000000-0xffffffffff also needs to be excluded.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agoiommu: adjust section annotations in pass-through code
Keir Fraser [Fri, 19 Nov 2010 13:23:34 +0000 (13:23 +0000)]
iommu: adjust section annotations in pass-through code

Most importantly, anything Dom0 construction related can be __init.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
15 years agox86/mm: Allocate log-dirty bitmaps from shadow/HAP memory.
Keir Fraser [Fri, 19 Nov 2010 13:21:09 +0000 (13:21 +0000)]
x86/mm: Allocate log-dirty bitmaps from shadow/HAP memory.

Move the p2m alloc and free functions back into the per-domain paging
assistance structure and allow them to be called from the log-dirty
code.  This makes it less likely that log-dirty code will run out of
memory populating the log-dirty bitmap.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
15 years agox86/mm: don't override an existing shadow memory allocation when
Keir Fraser [Fri, 19 Nov 2010 13:20:48 +0000 (13:20 +0000)]
x86/mm: don't override an existing shadow memory allocation when
enabling log-dirty shadows on a PV guest.

Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
15 years agox86 hvm: Refuse to perform __hvm_copy() work in atomic context.
Keir Fraser [Thu, 18 Nov 2010 12:28:31 +0000 (12:28 +0000)]
x86 hvm: Refuse to perform __hvm_copy() work in atomic context.

Soon we will properly handle paged out memory in this function by
sleeping in hypervisor context. This will require that all callers can
sleep.

If this check is too strong, we can reduce it to only applying to
guests with paging enabled (which also currently implies only guests
using Intel EPT). However my brief testing seems to indicate it works
okay.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agorcu_lock(current->domain) does not need to disable preemption.
Keir Fraser [Thu, 18 Nov 2010 12:26:27 +0000 (12:26 +0000)]
rcu_lock(current->domain) does not need to disable preemption.

If the guest sleeps in hypervisor context, it should not be destroyed
until execution reaches a safe point (i.e., guest context). This is
not implemented yet. :-) But the next patch will rely on it, to allow
an HVM guest to execute hypercalls that indirectly invoke __hvm_copy()
within an rcu_lock_current_domain() region.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoDefine Linux-style <preempt.h> interface.
Keir Fraser [Thu, 18 Nov 2010 11:45:33 +0000 (11:45 +0000)]
Define Linux-style <preempt.h> interface.

Use it to disable sleeping in spinlock and rcu-read regions.
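
As a rough illustration of the idea (not the code from this commit), a Linux-style preemption counter can be sketched as follows. The names preempt_disable(), preempt_enable() and in_atomic() follow the Linux convention referred to above; the single global counter, the toy_spin_lock()/toy_spin_unlock() wrappers and may_sleep() are assumptions made for this sketch. A real hypervisor would keep one counter per physical CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: one global counter; a real implementation would be per-CPU. */
static unsigned int preempt_count;

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
    assert(preempt_count > 0);   /* unbalanced enable is a bug */
    preempt_count--;
}

/* True while at least one spinlock or rcu-read region is active. */
static bool in_atomic(void) { return preempt_count != 0; }

/* Hypothetical lock wrappers: entering a locked region forbids sleeping. */
static void toy_spin_lock(void)   { preempt_disable(); /* acquire lock here */ }
static void toy_spin_unlock(void) { /* release lock here */ preempt_enable(); }

/* A sleeping primitive would check this before blocking. */
static bool may_sleep(void) { return !in_atomic(); }
```

Because the counter nests, may_sleep() only becomes true again once every toy_spin_lock() has been matched by a toy_spin_unlock().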

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agowaitqueue: Add license info to source file.
Keir Fraser [Thu, 18 Nov 2010 11:44:40 +0000 (11:44 +0000)]
waitqueue: Add license info to source file.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86_64: Fix booting 32-bit dom0
Keir Fraser [Wed, 17 Nov 2010 20:40:30 +0000 (20:40 +0000)]
x86_64: Fix booting 32-bit dom0

dom0/vcpu0 was not getting allocated a hypercall xlat area.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86/hvm/pmtimer: improving scalability of virtual time update
Keir Fraser [Wed, 17 Nov 2010 17:28:17 +0000 (17:28 +0000)]
x86/hvm/pmtimer: improving scalability of virtual time update

Mitigate the heavy contention on handle_pmt_io when running an HVM
guest configured with many cores (e.g., 32 cores). As the virtual time
of a domain must be fresh, someone should be updating it periodically.
But there is no need for a VCPU to update the virtual time while
another one is already updating it. Thus the update can be skipped
when the VCPU finds that someone else is updating the virtual time.
So every time a VCPU invokes handle_pmt_io to update the current
domain's virtual time, it first tries to acquire the pmtimer lock.
If it succeeds, it updates the virtual time. Otherwise, it skips
the update, waits for the pmtimer lock holder to finish updating
the virtual time, and returns the updated time.
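
The try-lock pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration using C11 atomics, not the code from the patch: struct pmtimer, pmt_trylock(), pmt_lock(), pmt_unlock(), pmt_update_time() and pmt_read() are all hypothetical names, and the "update" is a placeholder increment.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-domain PM timer state. */
struct pmtimer {
    atomic_flag lock;     /* simple test-and-set spinlock */
    uint32_t    tmr_val;  /* current virtual timer value */
};

static int pmt_trylock(struct pmtimer *s)
{
    /* Returns nonzero if the lock was free and is now held by us. */
    return !atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire);
}

static void pmt_lock(struct pmtimer *s)
{
    while (!pmt_trylock(s))
        ;  /* spin until the current holder releases the lock */
}

static void pmt_unlock(struct pmtimer *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Placeholder for the (expensive) virtual time update. */
static void pmt_update_time(struct pmtimer *s) { s->tmr_val++; }

static uint32_t pmt_read(struct pmtimer *s)
{
    uint32_t val;

    if (pmt_trylock(s)) {
        /* Nobody else is updating: do the update ourselves. */
        pmt_update_time(s);
    } else {
        /* Someone else is updating: skip our update, wait for theirs. */
        pmt_lock(s);
    }
    val = s->tmr_val;
    pmt_unlock(s);
    return val;
}
```

The point of the design is that a contended reader pays only for the wait, not for a second, redundant update.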

Signed-off-by: Xiang Song <xiangsong@fudan.edu.cn>
Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoWait queues, allowing conditional sleep in hypervisor context.
Keir Fraser [Wed, 17 Nov 2010 16:42:37 +0000 (16:42 +0000)]
Wait queues, allowing conditional sleep in hypervisor context.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoAdd locking-depth debugging, introduce in_atomic() boolean.
Keir Fraser [Tue, 16 Nov 2010 15:41:28 +0000 (15:41 +0000)]
Add locking-depth debugging, introduce in_atomic() boolean.

This will be useful for debugging use of sleep-in-hypervisor
primitives.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86_64: Make 32-bit-hypercall translate area per-vcpu.
Keir Fraser [Tue, 16 Nov 2010 14:16:36 +0000 (14:16 +0000)]
x86_64: Make 32-bit-hypercall translate area per-vcpu.

This is a prerequisite for allowing guest descheduling within a
hypercall.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86: Clean up vcpu initialisation (especially xsave save area)
Keir Fraser [Tue, 16 Nov 2010 14:09:13 +0000 (14:09 +0000)]
x86: Clean up vcpu initialisation (especially xsave save area)

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoMake multicall state per-vcpu rather than per-cpu
Keir Fraser [Tue, 16 Nov 2010 13:01:43 +0000 (13:01 +0000)]
Make multicall state per-vcpu rather than per-cpu

This is a prerequisite for allowing guest descheduling within a
hypercall.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86 hvm: Make a couple of hypercall state flags per-vcpu
Keir Fraser [Tue, 16 Nov 2010 12:42:35 +0000 (12:42 +0000)]
x86 hvm: Make a couple of hypercall state flags per-vcpu

This is a prerequisite for allowing guest descheduling within a
hypercall.

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agox86 hvm vpmu: Print error if VPMU cannot be init'ed on this CPU.
Keir Fraser [Tue, 16 Nov 2010 11:44:09 +0000 (11:44 +0000)]
x86 hvm vpmu: Print error if VPMU cannot be init'ed on this CPU.

Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Signed-off-by: Keir Fraser <keir@xen.org>
15 years agoamd iommu: Fix HV crash with 32bit pv_ops kernel
Keir Fraser [Tue, 16 Nov 2010 11:28:33 +0000 (11:28 +0000)]
amd iommu: Fix HV crash with 32bit pv_ops kernel

Signed-off-by: Wei Wang <wei.wang2@amd.com>
Tested-by: Conny Seidel <conny.seidel@amd.com>
15 years agox2apic: Remove a panic condition in enabling x2APIC
Keir Fraser [Mon, 15 Nov 2010 09:31:38 +0000 (09:31 +0000)]
x2apic: Remove a panic condition in enabling x2APIC

Currently Xen panics if the user disables VT-d on the command line
without also disabling x2APIC. This requires users to specify both
"iommu=0" and "x2apic=0" to disable VT-d if the platform supports
x2APIC, which is not user friendly. This patch removes the panic
condition, so users no longer need to specify "x2apic=0" when
specifying "iommu=0". As long as VT-d is not enabled (disabled in
the BIOS or on the command line), x2APIC will not be enabled either,
because x2APIC depends on VT-d interrupt remapping.

Signed-off-by: Weidong Han <weidong.han@intel.com>
15 years agox86 xsave: Adding back CPUID support for Xsave (version 2)
Keir Fraser [Mon, 15 Nov 2010 09:27:53 +0000 (09:27 +0000)]
x86 xsave: Adding back CPUID support for Xsave (version 2)

XSave support via CPUID virtualization for both PV and HVM guests.

Signed-off-by: Shan Haitao <haitao.shan@intel.com>
15 years agolibxc: correct dirty_bitmap bounce size in xc_hvm_track_dirty_vram
Ian Campbell [Wed, 10 Nov 2010 14:56:06 +0000 (14:56 +0000)]
libxc: correct dirty_bitmap bounce size in xc_hvm_track_dirty_vram

The size should be in bytes not 32-bit words. Fixes graphics
corruption issues for HVM guests due to bouncing too little data.

Also the dirty_bitmap buffer is output only and therefore only needs
bouncing in one direction.
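
To make the bytes-versus-words distinction concrete, here is a small sketch. It assumes a bitmap with one bit per page, rounded up to whole 32-bit words; bitmap_words() and bitmap_bytes() are illustrative helpers, not libxc functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold one bit per page (assumption). */
static size_t bitmap_words(size_t nr_pages)
{
    return (nr_pages + 31) / 32;
}

/* Size of the same bitmap in bytes: this is what a copy/bounce needs. */
static size_t bitmap_bytes(size_t nr_pages)
{
    return bitmap_words(nr_pages) * sizeof(uint32_t);
}
```

Passing the word count where a byte count is expected copies only a quarter of the bitmap, which matches the "too little data" symptom described above.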

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools/hotplug/Linux: supply --physdev-is-bridged in iptables runes
Sander Eikelenboom [Wed, 10 Nov 2010 14:37:19 +0000 (14:37 +0000)]
tools/hotplug/Linux: supply --physdev-is-bridged in iptables runes

With newer (pvops) kernels logs get flooded with this iptables
warning: physdev match: using --physdev-out in the OUTPUT, FORWARD and
POSTROUTING chains for non-bridged traffic is not supported anymore

Using the --physdev-is-bridged option prevents this.
See also: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=571634#10

Signed-off-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agohvmloader: Fix 22383:cba667fb80cf iterating over defns 0..255
Keir Fraser [Wed, 10 Nov 2010 14:15:23 +0000 (14:15 +0000)]
hvmloader: Fix 22383:cba667fb80cf iterating over defns 0..255

We need to declare devfn as wider than 8 bits for a loop over
0<=devfn<256 to terminate.
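
A short sketch of why the width matters. The function names are illustrative, and the 8-bit counter type is an assumption about the earlier code: an 8-bit counter wraps from 255 back to 0, so the condition devfn < 256 can never become false.

```c
#include <stdint.h>

/* With a counter wider than 8 bits the loop visits 0..255 and stops. */
static unsigned int count_devfns(void)
{
    unsigned int devfn, n = 0;

    for (devfn = 0; devfn < 256; devfn++)
        n++;
    return n;
}

/*
 * What an 8-bit counter does on increment: 255 + 1 wraps to 0, so a
 * loop of the form "for (uint8_t devfn = 0; devfn < 256; devfn++)"
 * would never terminate.
 */
static uint8_t next_devfn_8bit(uint8_t devfn)
{
    return (uint8_t)(devfn + 1);
}
```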

Signed-off-by: Keir Fraser <keir@xen.org>
15 years agohvmloader: fix off-by-one-bit error when initialising PCI devices
Keir Fraser [Wed, 10 Nov 2010 13:58:16 +0000 (13:58 +0000)]
hvmloader: fix off-by-one-bit error when initialising PCI devices

hvmloader is responsible for - amongst other things - initialising
the PCI device BARs prior to loading the guest BIOS.  The previous
code only probed for devfn up to 128.  The lower 3 bits are function
IDs so this meant that only devices in slots 0-15 were actually being
initialized.
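
The arithmetic behind this can be sketched as follows. The macro and function names are illustrative; the layout is the usual PCI devfn encoding, with the function number in the low 3 bits and the slot (device) number in the upper 5 bits.

```c
/* Illustrative helpers for the 8-bit PCI devfn encoding. */
#define DEVFN_SLOT(devfn) (((devfn) >> 3) & 0x1f)  /* upper 5 bits */
#define DEVFN_FUNC(devfn) ((devfn) & 0x07)         /* lower 3 bits */

/* Highest slot reached by a probe loop "for (devfn = 0; devfn < limit; ...)". */
static unsigned int max_slot_probed(unsigned int devfn_limit)
{
    return DEVFN_SLOT(devfn_limit - 1);
}
```

With a limit of 128 the highest devfn probed is 127, i.e. slot 15 function 7; a limit of 256 is needed to reach slot 31.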

Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Acked-by: Gianni Tedesco <gianni.tedesco@citrix.com>
15 years agohvmloader: Fix acpi static tables for new ACPI ioports location.
Keir Fraser [Tue, 9 Nov 2010 20:37:46 +0000 (20:37 +0000)]
hvmloader: Fix acpi static tables for new ACPI ioports location.

This changes some FADT values (the addresses of the ACPI ioports) and
the pm1a_evt_address value written for the PCI bus.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
15 years agotools: provide explicit target for refetching/resetting qemu
Ian Campbell [Tue, 9 Nov 2010 18:15:25 +0000 (18:15 +0000)]
tools: provide explicit target for refetching/resetting qemu

This patch adds an explicit update mechanism:
  make tools/ioemu-dir-force-update
This isn't brilliant but is better than doing "cd tools/ioemu-remote
&& git reset --hard <sha1...>" by hand.

Note that invoking this target will destroy all working tree changes
made to qemu-xen.

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agofirmware, qemu: Change ACPI IO values to match QEMU BIOS
Anthony Perard [Tue, 9 Nov 2010 18:03:55 +0000 (18:03 +0000)]
firmware, qemu: Change ACPI IO values to match QEMU BIOS

As part of the QEMU/Xen merge, this patch changes the values of the
sleep states and adds some information to the PCI registers to match
QEMU's BIOS implementation.

It also does a hypercall (HVM_PARAM_ACPI_IOPORTS_LOCATION) that tells
Xen to use the new I/O ports instead of the old ones.

[ Also, in this patch, update QEMU_TAG to the qemu-xen revision
  with the corresponding qemu change. -iwj ]

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools: libxl: fix cpuid compilation errors for ia64
KUWAMURA Shin'ya [Tue, 9 Nov 2010 17:43:12 +0000 (17:43 +0000)]
tools: libxl: fix cpuid compilation errors for ia64

ia64 does not have cpuid.  So break out cpuid-related functions into a
separate file, with stubs for ia64.

Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agotools: libxl: Fix some "const const"s introduced in 711cb4229900
Ian Jackson [Tue, 9 Nov 2010 12:00:05 +0000 (12:00 +0000)]
tools: libxl: Fix some "const const"s introduced in 711cb4229900

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
15 years agoia64: fix the build (again)
Keir Fraser [Tue, 9 Nov 2010 11:50:37 +0000 (11:50 +0000)]
ia64: fix the build (again)

Signed-off-by: Jan Beulich <jbeulich@novell.com>